--- title: Data Transforms for Hyperspectral Datacubes keywords: fastai sidebar: home_sidebar summary: "Hyperspectral datacubes have three axes (cross-track, along-track, and wavelength) and are stored in 3D numpy ndarrays. When using a pushbroom scanner, the datacube is filled gradually. The implementation here is based on a circular buffer, and we provide additional methods to apply a pipeline of transforms which, for example, can be used for smile correction and radiance conversion." description: "Hyperspectral datacubes have three axes (cross-track, along-track, and wavelength) and are stored in 3D numpy ndarrays. When using a pushbroom scanner, the datacube is filled gradually. The implementation here is based on a circular buffer, and we provide additional methods to apply a pipeline of transforms which, for example, can be used for smile correction and radiance conversion." nb_path: "00_data.ipynb" ---

Generic Circular Buffer on numpy.ndarrays

The base functionality is implemented on a generic circular buffer. The datatype dtype can be modified as desired but the default is set to store int32 digital numbers.
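The core mechanics can be sketched in a few lines (an illustrative reimplementation, not the actual CircArrayBuffer source; MiniCircBuffer is a made-up name):

```python
import numpy as np

class MiniCircBuffer:
    """Illustrative fixed-size FIFO buffer over a preallocated ndarray."""
    def __init__(self, size=(4,), dtype=np.int32):
        self.data = np.zeros(size, dtype=dtype)
        self.write_pos = 0   # next slot to overwrite
        self.read_pos = 0    # oldest unread slot
        self.count = 0       # number of unread items

    def put(self, line):
        """Write one (n-1)-dim slice; overwrite the oldest when full."""
        self.data[self.write_pos] = line
        self.write_pos = (self.write_pos + 1) % len(self.data)
        if self.count == len(self.data):   # full: the oldest slot was just lost
            self.read_pos = self.write_pos
        else:
            self.count += 1

    def get(self):
        """Read and consume the oldest slice, or None when empty."""
        if self.count == 0:
            return None
        out = self.data[self.read_pos].copy()
        self.read_pos = (self.read_pos + 1) % len(self.data)
        self.count -= 1
        return out
```

With `size=(7,)` and nine puts of 0..8, the reads return 2, 3, 4, 5, 6, 7, 8, then None, matching the CircArrayBuffer example below.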

{% raw %}

class CircArrayBuffer[source]

CircArrayBuffer(size:tuple=(100, 100), axis:int=0, dtype:type=int32, show_func:Optional[ndarray]=None)

Circular FIFO Buffer implementation on ndarrays. Each put/get is a (n-1)darray.

{% endraw %} {% raw %}
{% endraw %} {% raw %}

CircArrayBuffer.put[source]

CircArrayBuffer.put(line:ndarray)

Writes a (n-1)darray into the buffer

CircArrayBuffer.get[source]

CircArrayBuffer.get()

Reads the oldest (n-1)darray from the buffer

CircArrayBuffer.show[source]

CircArrayBuffer.show()

Display the data

{% endraw %}

For example, we can write to a 1D buffer

{% raw %}
cib = CircArrayBuffer(size=(7,),axis=0)
for i in range(9):
    cib.put(i)
    cib.show()

for i in range(9):
    print(i,cib.get())
#(7) [0 0 0 0 0 0 0]
#(7) [0 1 0 0 0 0 0]
#(7) [0 1 2 0 0 0 0]
#(7) [0 1 2 3 0 0 0]
#(7) [0 1 2 3 4 0 0]
#(7) [0 1 2 3 4 5 0]
#(7) [0 1 2 3 4 5 6]
#(7) [7 1 2 3 4 5 6]
#(7) [7 8 2 3 4 5 6]
0 2
1 3
2 4
3 5
4 6
5 7
6 8
7 None
8 None
{% endraw %}

Or a 2D array

{% raw %}
plots_list = []

cib = CircArrayBuffer(size=(4,4),axis=0)
cib.put(1) # scalars are broadcast to a 1D array
for i in range(5):
    cib.put(cib.get()+1)
    plots_list.append( cib.show().opts(colorbar=True,title=f"i={i}") )

hv.Layout(plots_list).cols(3)
{% endraw %}

Loading Camera Settings and Calibration Files

The OpenHSI camera has a settings dictionary which contains these fields:

  • camera_id is your camera name,
  • row_slice indicates which rows are illuminated; the rest are cropped out,
  • resolution is the full pixel resolution given by the camera without cropping,
  • fwhm_nm specifies the size of spectral bands in nanometres,
  • exposure_ms is the camera exposure time last used,
  • luminance is the reference luminance used to convert digital numbers to luminance,
  • longitude is the longitude in degrees east,
  • latitude is the latitude in degrees north,
  • datetime_str is the UTC time at the time of data collection,
  • altitude is the altitude above sea level (assuming the target is at sea level) measured in km,
  • radiosonde_station_num is the station number from http://weather.uwyo.edu/upperair/sounding.html,
  • radiosonde_region is the region code from http://weather.uwyo.edu/upperair/sounding.html,
  • sixs_path is the path to the 6SV executable,
  • binxy is the number of pixels to bin in the (x,y) direction,
  • win_offset is the (x,y) offset from the edge of the detector for a selective readout window (used in combination with a win_resolution smaller than the full detector size),
  • win_resolution is the size (width, height) of the detector area to read out, and
  • pixel_format is the pixel readout format, i.e. 8bit, 10bit, or 12bit.

The settings dictionary may also contain additional camera specific fields:

  • mac_addr is the GigE camera MAC address (used by Lucid Vision sensors), and
  • serial_num is the detector serial number (used by Ximea and FLIR sensors).

The pickle file is a dictionary with these fields:

  • camera_id is your camera name,
  • HgAr_pic is a picture of a mercury argon lamp's spectral lines for wavelength calibration,
  • flat_field_pic is a picture of a well-lit scene used for calculating the illuminated area,
  • smile_shifts is an array of pixel shifts needed to correct for smile error,
  • wavelengths_linear is an array of wavelengths after linear interpolation,
  • wavelengths is an array of wavelengths after cubic interpolation,
  • rad_ref is a 4D datacube with coordinates of cross-track, wavelength, exposure, and luminance,
  • sfit is the spline fit function from the integrating sphere calibration, and
  • rad_fit is the interpolated function of the expected radiance at sensor computed using 6SV.

These files are unique to each OpenHSI camera.

{% raw %}

class CameraProperties[source]

CameraProperties(json_path:str='assets/cam_settings.json', pkl_path:str='assets/cam_calibration.pkl', print_settings=False, **kwargs)

Save and load OpenHSI camera settings and calibration

{% endraw %} {% raw %}
{% endraw %} {% raw %}

CameraProperties.dump[source]

CameraProperties.dump(json_path:str=None, pkl_path:str=None)

Save the settings and calibration files

{% endraw %}

For example, the contents of CameraProperties consist of two dictionaries. To produce the files cam_settings.json and cam_calibration.pkl, follow the steps outlined in the calibration module.

{% raw %}
cam_prop = CameraProperties(pkl_path="")
cam_prop
settings = 
{'camera_id': '0', 'row_slice': [197, 649], 'resolution': [772, 2064], 'fwhm_nm': 4, 'exposure_ms': 10, 'longitude': -17.7, 'latitude': 146.1, 'datetime_str': '2021-05-26 03:26', 'altitude': 0.12, 'radiosonde_station_num': 94299, 'radiosonde_region': 'pac', 'sixs_path': 'assets/6SV1.1/sixsV1.1', 'luminance': 30000}

calibration = 
{}
{% endraw %} {% raw %}
# cam_prop.calibration["rad_ref"]
{% endraw %}

Transforms

We can apply a number of transforms to the camera's raw data, and these transforms are used to modify the processing level during data collection. For example, we can perform a fast smile correction and wavelength binning during operation. With more processing, this is easily extended to obtain radiance and reflectance.

Some transforms require setup, which is done using CameraProperties.tfm_setup. This method also allows you to tack on an additional setup function via the more_setup argument, which accepts any callable that can mutate the CameraProperties class.
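Conceptually, the pipeline is just an ordered list of callables applied left to right. A minimal sketch of that composition idea (compose, crop, and scale below are hypothetical stand-ins, not the library's transforms):

```python
import numpy as np
from functools import reduce
from typing import Callable, List

def compose(tfms: List[Callable[[np.ndarray], np.ndarray]]) -> Callable:
    """Chain transforms so the first in the list runs first."""
    return lambda x: reduce(lambda acc, f: f(acc), tfms, x)

# hypothetical transforms standing in for crop/smile correction/binning
crop  = lambda x: x[1:-1, :]   # drop first and last cross-track rows
scale = lambda x: x * 2        # stand-in for a radiometric step

pipeline = compose([crop, scale])
out = pipeline(np.ones((5, 4), dtype=np.int32))
print(out.shape)  # (3, 4)
```

The real pipeline is configured with tfm_setup and set_processing_lvl, described below.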

{% raw %}

CameraProperties.tfm_setup[source]

CameraProperties.tfm_setup(more_setup:Optional[CameraProperties]=None, dtype:Union[int32, float32]=int32)

Setup for transforms

{% endraw %} {% raw %}
{% endraw %} {% raw %}

CameraProperties.crop[source]

CameraProperties.crop(x:ndarray)

Crops to illuminated area

{% endraw %} {% raw %}

CameraProperties.fast_smile[source]

CameraProperties.fast_smile(x:ndarray)

Apply the fast smile correction procedure

{% endraw %} {% raw %}
{% endraw %} {% raw %}

CameraProperties.fast_bin[source]

CameraProperties.fast_bin(x:ndarray)

Changes the view of the datacube so that everything that needs to be binned is in the last axis. The last axis is then binned.
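{% endraw %}

The idea behind fast binning can be illustrated with a plain numpy reshape, assuming the spectral axis is the last one and ignoring the index-wavelength nonlinearity that slow_bin accounts for (fast_bin_sketch and bin_size are illustrative names, not the library's API):

```python
import numpy as np

def fast_bin_sketch(x: np.ndarray, bin_size: int) -> np.ndarray:
    """Bin the spectral (last) axis by summing groups of `bin_size` pixels."""
    rows, cols = x.shape
    usable = cols - (cols % bin_size)  # drop leftover pixels that don't fill a bin
    view = x[:, :usable].reshape(rows, usable // bin_size, bin_size)
    return view.sum(axis=-1)           # the last axis is then binned

x = np.arange(12, dtype=np.int32).reshape(2, 6)
print(fast_bin_sketch(x, 3))  # [[ 3 12] [21 30]]
```

The reshape produces a view, so the only data pass is the final sum, which is what makes this binning fast.

{% raw %}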

{% endraw %} {% raw %}

CameraProperties.slow_bin[source]

CameraProperties.slow_bin(x:ndarray)

Bins spectral bands accounting for the slight nonlinearity in the index-wavelength map

{% endraw %} {% raw %}
{% endraw %} {% raw %}

CameraProperties.dn2rad[source]

CameraProperties.dn2rad(x:Array['λ,x', int32])

Converts digital numbers to radiance (uW/cm^2/sr/nm). Use after cropping to useable area.
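{% endraw %}

The general shape of such a conversion (not the actual OpenHSI calibration, which uses the integrating-sphere fits stored in the pickle file) is dark-frame subtraction followed by a per-pixel radiometric gain:

```python
import numpy as np

def dn2rad_sketch(dn: np.ndarray, dark: np.ndarray, gain: np.ndarray) -> np.ndarray:
    """Generic DN -> radiance: subtract the dark frame, then apply a
    per-pixel gain in uW/cm^2/sr/nm per digital number."""
    return (dn.astype(np.float32) - dark) * gain

dn   = np.full((2, 3), 110, dtype=np.int32)
dark = np.full((2, 3), 10, dtype=np.float32)
gain = np.full((2, 3), 0.05, dtype=np.float32)  # hypothetical calibration gain
print(dn2rad_sketch(dn, dark, gain))  # 5.0 everywhere
```

{% raw %}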

{% endraw %} {% raw %}

CameraProperties.rad2ref_6SV[source]

CameraProperties.rad2ref_6SV(x:Array['λ,x', float32])

{% endraw %} {% raw %}
{% endraw %} {% raw %}

CameraProperties.set_processing_lvl[source]

CameraProperties.set_processing_lvl(lvl:int=2, custom_tfms:List[Callable[ndarray, ndarray]]=None)

Define the output lvl of the transform pipeline.

  • 0 : raw digital numbers cropped to useable sensor area
  • 1 : case 0 + fast smile correction
  • 2 : case 1 + fast binning (default)
  • 3 : case 1 + slow binning
  • 4 : case 2 + conversion to radiance in units of uW/cm^2/sr/nm
  • 5 : case 4 except radiance conversion moved to 2nd step
  • 6 : case 4 + conversion to reflectance
  • 7 : smile corrected and binned -> radiance
  • 8 : case 7 + converted to reflectance

{% endraw %} {% raw %}
{% endraw %}

You can add your own transform by monkey patching the CameraProperties class.

@patch
def identity(self:CameraProperties, x:np.ndarray) -> np.ndarray:
    """The identity tranform"""
    return x

If you don't require any camera settings or calibration files, a valid transform can be any Callable that takes in a 2D np.ndarray and returns a 2D np.ndarray.
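For instance, any plain function with that signature qualifies (flip_tracks below is a made-up example, not part of the library):

```python
import numpy as np

def flip_tracks(x: np.ndarray) -> np.ndarray:
    """A standalone transform: reverse the cross-track axis."""
    return x[::-1, :]

frame = np.arange(6, dtype=np.int32).reshape(3, 2)
print(flip_tracks(frame)[0].tolist())  # [4, 5]
```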

Pipeline for Composing Transforms

Depending on the level of processing that one wants to do real-time, a number of transforms need to be composed in sequential order. To make this easy to customise, you can use the pipeline method and pass in a raw camera frame and an ordered list of transforms.

To make the transforms pipeline easy to use and customise, you can use the CameraProperties.set_processing_lvl method.

{% raw %}

CameraProperties.pipeline[source]

CameraProperties.pipeline(x:ndarray)

Compose a list of transforms and apply to x.

{% endraw %} {% raw %}
{% endraw %} {% raw %}
cam_prop = CameraProperties("assets/cam_settings.json","assets/cam_calibration.pkl")

cam_prop.set_processing_lvl(-1) # raw digital numbers
test_eq( (772,2064), np.shape( cam_prop.pipeline(cam_prop.calibration["HgAr_pic"])) )

cam_prop.set_processing_lvl(0) # cropped
test_eq( (452,2064), np.shape( cam_prop.pipeline(cam_prop.calibration["HgAr_pic"])) )

cam_prop.set_processing_lvl(1) # fast smile corrected
test_eq( (452,2055), np.shape( cam_prop.pipeline(cam_prop.calibration["HgAr_pic"])) )

cam_prop.set_processing_lvl(2) # fast binned
test_eq( (452,108),  np.shape( cam_prop.pipeline(cam_prop.calibration["HgAr_pic"])) )

cam_prop.set_processing_lvl(4) # radiance
test_eq( (452,108),  np.shape( cam_prop.pipeline(cam_prop.calibration["HgAr_pic"])) )

cam_prop.set_processing_lvl(6) # reflectance
test_eq( (452,108),  np.shape( cam_prop.pipeline(cam_prop.calibration["HgAr_pic"])) )

cam_prop.set_processing_lvl(5) # radiance conversion moved earlier in pipeline
test_eq( (452,108),  np.shape( cam_prop.pipeline(cam_prop.calibration["HgAr_pic"])) )

cam_prop.set_processing_lvl(3) # slow binned
test_eq( (452,108),  np.shape( cam_prop.pipeline(cam_prop.calibration["HgAr_pic"])) )
{% endraw %}

Buffer for Data Collection

DataCube takes a line with coordinates of wavelength (x-axis) against cross-track (y-axis), and stores the smile corrected version in its CircArrayBuffer.

To facilitate near real-time processing, a fast smile correction procedure is used. An option to use a fast binning procedure is also available. When using these two procedures, the overhead is roughly 2 ms on a Jetson board.
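The fast smile correction amounts to shifting each cross-track row by a precomputed integer pixel offset so the spectral lines align. A minimal sketch (fast_smile_sketch and the shifts values are illustrative, not the stored smile_shifts calibration):

```python
import numpy as np

def fast_smile_sketch(x: np.ndarray, shifts: np.ndarray, out_width: int) -> np.ndarray:
    """Shift each row left by its smile offset, keeping a common spectral window."""
    out = np.empty((x.shape[0], out_width), dtype=x.dtype)
    for i, s in enumerate(shifts):
        out[i] = x[i, s:s + out_width]
    return out

frame  = np.tile(np.arange(10, dtype=np.int32), (3, 1))  # 3 rows, 10 spectral pixels
shifts = np.array([0, 1, 2])                             # hypothetical per-row shifts
out = fast_smile_sketch(frame, shifts, out_width=10 - shifts.max())
print(out.shape)  # (3, 8)
```

Because only integer shifts and slicing are involved, no interpolation is needed, which is what keeps the overhead low.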

Instead of preallocating another buffer for another data collection, one can exploit the circular nature of the DataCube and reuse the internal buffer without modification - just use DataCube.put as normal.

Storage Allocation

All data buffers are preallocated so it's no secret that hyperspectral datacubes are memory hungry. For reference:

| along-track pixels | wavelength binning | RAM needed | time to collect at 10 ms exposure | time to save to SSD |
|---|---|---|---|---|
| 4096 | 4 nm | ≈ 800 MB | ≈ 55 s | ≈ 3 s |
| 1024 | no binning | ≈ 4 GB | ≈ 14 s | ≈ 15 s |
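These RAM figures follow directly from the datacube shape. A rough estimate, assuming 452 illuminated cross-track pixels, 4-byte values, and illustrative band counts (~108 bands after 4 nm binning, ~2048 spectral pixels unbinned):

```python
def datacube_bytes(n_lines: int, cross_track: int, n_bands: int,
                   bytes_per_val: int = 4) -> int:
    """Preallocated buffer size in bytes for an int32/float32 datacube."""
    return n_lines * cross_track * n_bands * bytes_per_val

# 4096 along-track lines, 452 cross-track pixels, ~108 bands after 4 nm binning
print(datacube_bytes(4096, 452, 108) / 1e6)   # ≈ 800 MB
# 1024 lines with no binning (~2048 spectral pixels)
print(datacube_bytes(1024, 452, 2048) / 1e9)  # ≈ 3.8 GB
```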

In practice, it is very difficult to work with raw, unbinned data due to the massive RAM usage and the extended time to save the NetCDF file to disk, which hinders real-time analysis. Binning drops the frame rate (at 10 ms exposure) from 90 fps to 75 fps. In our experimentation, an SSD mounted in an M.2 slot on a Jetson board provided the fastest experience. When using other development boards such as a Raspberry Pi 4, the USB 3.0 port is recommended over the USB 2.0 port.

{% raw %}

class DateTimeBuffer[source]

DateTimeBuffer(n:int=16)

Records timestamps in UTC time.

{% endraw %} {% raw %}
{% endraw %} {% raw %}

DateTimeBuffer.update[source]

DateTimeBuffer.update()

Stores current UTC time in an internal buffer when this method is called.

{% endraw %} {% raw %}
timestamps = DateTimeBuffer(n=8)
for i in range(8):
    timestamps.update()
    
timestamps.data
array([datetime.datetime(2021, 12, 13, 4, 13, 33, 140562, tzinfo=datetime.timezone.utc),
       datetime.datetime(2021, 12, 13, 4, 13, 33, 140585, tzinfo=datetime.timezone.utc),
       datetime.datetime(2021, 12, 13, 4, 13, 33, 140591, tzinfo=datetime.timezone.utc),
       datetime.datetime(2021, 12, 13, 4, 13, 33, 140596, tzinfo=datetime.timezone.utc),
       datetime.datetime(2021, 12, 13, 4, 13, 33, 140600, tzinfo=datetime.timezone.utc),
       datetime.datetime(2021, 12, 13, 4, 13, 33, 140605, tzinfo=datetime.timezone.utc),
       datetime.datetime(2021, 12, 13, 4, 13, 33, 140609, tzinfo=datetime.timezone.utc),
       datetime.datetime(2021, 12, 13, 4, 13, 33, 140613, tzinfo=datetime.timezone.utc)],
      dtype=object)
{% endraw %} {% raw %}

class DataCube[source]

DataCube(n_lines:int=16, processing_lvl:int=2, json_path:str='assets/cam_settings.json', pkl_path:str='assets/cam_calibration.pkl', print_settings=False) :: CameraProperties

Collects lines into a preallocated datacube buffer, applying the chosen transform pipeline to each line.

{% endraw %} {% raw %}
{% endraw %} {% raw %}

DataCube.put[source]

DataCube.put(x:ndarray)

Applies the composed transforms and writes the 2D array into the data cube. Stores a timestamp for each put.

{% endraw %} {% raw %}

DataCube.save[source]

DataCube.save(save_dir:str, preconfig_meta_path:str=None, prefix:str='', suffix:str='')

Saves a NetCDF file (and an RGB representation) to the directory save_dir, in a folder named by the date, with the file name given by the UTC time.

{% endraw %} {% raw %}
{% endraw %}

For example, if this data was collected on 2021/10/03 at 12:18:30.010568 UTC, then there will be two files saved under the directory 2021_10_03. The two files are the NetCDF file 2021_10_03-12_18_30.nc and the PNG image 2021_10_03-12_18_30.png. You can also add a custom prefix and suffix to the file name.
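The naming convention described above can be reproduced with strftime (a sketch of the convention, not the library's internal code; save_paths_sketch is a hypothetical name):

```python
from datetime import datetime, timezone

def save_paths_sketch(ts: datetime, prefix: str = "", suffix: str = ""):
    """Directory named by the date, file stem named by the UTC time."""
    folder = ts.strftime("%Y_%m_%d")
    stem   = prefix + ts.strftime("%Y_%m_%d-%H_%M_%S") + suffix
    return f"{folder}/{stem}.nc", f"{folder}/{stem}.png"

ts = datetime(2021, 10, 3, 12, 18, 30, 10568, tzinfo=timezone.utc)
print(save_paths_sketch(ts))
# ('2021_10_03/2021_10_03-12_18_30.nc', '2021_10_03/2021_10_03-12_18_30.png')
```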

{% raw %}

DataCube.load_nc[source]

DataCube.load_nc(nc_path:str, old_style:bool=False)

Lazy load a NetCDF datacube into the DataCube buffer.

{% endraw %} {% raw %}
{% endraw %}

The plot_lib argument selects the plotting backend. Choose matplotlib for static plots; choose bokeh if you want to compose plots together and use interactive tools.

{% raw %}

DataCube.show[source]

DataCube.show(plot_lib:str='bokeh', red_nm:float=640.0, green_nm:float=550.0, blue_nm:float=470.0, robust:bool=False, hist_eq:bool=False)

Generate a histogram equalised RGB plot from chosen RGB wavelengths. The plotting backend can be specified by plot_lib and can be "bokeh" or "matplotlib".

{% endraw %} {% raw %}
{% endraw %} {% raw %}
n = 256

dc = DataCube(n_lines=n,processing_lvl=2,json_path="assets/cam_settings.json",pkl_path="assets/cam_calibration.pkl")

for i in range(200):
    dc.put( np.random.randint(0,255,dc.settings["resolution"]) )

dc.show("bokeh")
{% endraw %} {% raw %}
dc.save("assets")
{% endraw %} {% raw %}
# remove the directory and file just created so it doesn't clog up the repo after running tests
import shutil, os
shutil.rmtree("assets/"+[ f for f in os.listdir("assets/") if "20" in f and "_" in f ][0])
{% endraw %}